Orthogonal signal projection

ABSTRACT

The invention regards a method and an arrangement for filtering or pre-processing almost any type of multivariate data, exemplified by NIR or NMR spectra measured on samples, in order to remove systematic noise such as baseline variation and multiplicative scatter effects. This is accomplished by differentiating the spectra to first or second derivatives, by Multiplicative Signal Correction (MSC), or by similar filtering methods. The pre-processing may, however, also remove information regarding Y (the response variables) from the spectra, as well as from other multiple measurement arrays. Provided is a variant of PLS that can be used to achieve a signal correction that is as close to orthogonal as possible to a given y vector or Y matrix, thus ensuring that the signal correction removes as little information as possible regarding Y. A filter according to the present invention is named Orthogonal Partial Least Squares (OPLS).

TECHNICAL FIELD

[0001] The present invention pertains to a method for concentration or property calibration of substances or matter and an arrangement for calibration of spectroscopic input data from samples, whereby concentration or property calibration determines a model for further samples of the same type.

BACKGROUND OF THE INVENTION

[0002] Multiple measurement vectors and arrays are increasingly being used for the characterization of solid, semi-solid, fluid and vapor samples. Examples of methods giving such multiple measurements are Near Infrared Spectroscopy (NIR) and Nuclear Magnetic Resonance (NMR) spectroscopy. Frequently the objective with this characterization is to determine the value of one or several concentrations in the samples. Multivariate calibration is then used to develop a quantitative relation between the digitized spectra, a matrix X, and the concentrations, in a matrix Y, as reviewed by H. Martens and T. Naes, Multivariate Calibration, Wiley, N.Y., 1989. NIR and other spectroscopies are also increasingly used to infer other properties Y of samples than concentrations, e.g., the strength and viscosity of polymers, the thickness of a tablet coating, and the octane number of gasoline.

[0003] The first step of a multivariate calibration is often to pre-process input data. The reason is that spectra, as well as other multiple measurement arrays, often contain systematic variation that is unrelated to the response y or the responses Y. For solid samples this systematic variation is due to, among others, light scattering and differences in spectroscopic path length, and may often constitute the major part of the variation of the sample spectra.

[0004] Another reason for systematic but unwanted variation in the sample spectra may be that the analyte of interest absorbs only in small parts of the spectral region. The variation in X that is unrelated to Y may disturb the multivariate modeling and cause imprecise predictions for new samples, and also affect the robustness of the model over time.

[0005] For the removal of undesirable systematic variation in the data, two types of pre-processing methods are commonly reported in the analytical chemistry literature: differentiation and signal correction. Popular approaches of signal correction include Savitzky-Golay smoothing, A. Savitzky and M. J. E. Golay, Anal. Chem. 36 (1964) 1627-1639; multiplicative signal correction (MSC), H. Martens and T. Naes, Multivariate Calibration, Wiley, N.Y., 1989, and P. Geladi, D. MacDougall, and H. Martens, Linearization and Scatter-Correction for Near-Infrared Reflectance Spectra of Meat, Applied Spectroscopy 39 (1985) 491-500; Fourier transformation, P. C. Williams and K. Norris, Near-Infrared Technology in the Agricultural and Food Industries, American Association of Cereal Chemists, St. Paul, Minn. (1987); principal component analysis (PCA), J. Sun, Statistical Analysis of NIR Data: Data Pretreatment, J. Chemom. 11 (1997) 525-532; variable selection, H. Martens and T. Naes, Multivariate Calibration, Wiley, N.Y., 1989, and M. Baroni, S. Clementi, G. Cruciani, G. Constantino, and D. Riganelli, Predictive Ability of Regression Models, Part 2: Selection of the Best Predictive PLS Model, J. Chemom. 6 (1992) 347-356; and baseline correction, H. Martens and T. Naes, Multivariate Calibration, Wiley, N.Y., 1989, and R. J. Barnes, M. S. Dhanoa, and S. J. Lister, Standard Normal Variate Transformation and De-trending of Near-Infrared Diffuse Reflectance Spectra, Appl. Spectrosc. 43 (1989) 772-777.

[0006] These signal corrections are different cases of filtering, where a signal (e.g., an NIR spectrum) is made to have "better properties" by passing it through a filter. The objectives of filtering are often rather vague; it is not always easy to specify what is meant by "better properties". Even in the case of calibration, where it is possible to specify this objective in terms of lowered prediction errors or simpler calibration models, it is difficult to construct general filters that indeed improve these properties of the data.

[0007] Projections to latent structures by means of partial least squares (PLS) is one of the main generalized regression methods for analyzing multivariate data where a quantitative relationship between a descriptor matrix X and a quality matrix Y is wanted. Multivariate calibration, classification, discriminant analysis and pattern recognition are a few areas where PLS has been shown to be a useful tool. The main reasons for its success are that it can cope with collinearity among variables, noise in both X and Y, and moderate amounts of missing data in both X and Y, and that it can handle multiple Y simultaneously. These types of complicated data are now common due to the advent of analytical instruments such as HPLC, LC-UV, LC-MS, and spectroscopy instruments.

[0008] Improved and modified PLS methods using the so-called NIPALS method, H. Wold, Nonlinear Estimation by Iterative Least Squares Procedures, in F. David (Editor), Research Papers in Statistics, Wiley, New York, 1966, pp. 411-444, have been suggested since the birth of PLS in 1977. A modification of the PLS method is presented here. It aims at improving the interpretation of PLS models, reducing model complexity, and improving predictions and robustness.

[0009] Spectroscopic methods represent a fairly cheap, quick and easy way of retrieving information about samples. In the characterization of organic substances such as wood, pulp, pharmaceutical tablets, ethanol content, etc., near infrared (NIR), NMR, and other instruments have proven useful.

SUMMARY OF THE DESCRIBED INVENTION

[0010] The present invention sets forth a generic preprocessing method called orthogonal partial least squares (OPLS) for use in multivariate data analysis (MVA). The concept is to remove variation from X (descriptor variables) that is irrelevant to Y (quality variables, for example yield, cost or toxicity). In mathematical terms, this is equivalent to removing variation in X that is orthogonal to Y. Earlier, S. Wold, H. Antti, F. Lindgren, J. Öhman, Orthogonal Signal Correction of Near-Infrared Spectra, Chemometrics and Intelligent Laboratory Systems, 44 (1998) 175-185, have described the orthogonal signal correction (OSC) technique, which has been shown to be successful in removing information in X that is irrelevant to Y. In the present description, a method based on the same criteria, but with different means, is disclosed.

[0011] According to the present invention the OPLS method improves the quality of a resulting calibration model regarding prediction ability, model parsimony, and interpretation.

[0012] In order to overcome problems and to achieve purposes, the present invention provides a method for concentration or property calibration of input data from samples of substances or matter, said calibration determining a filter model for further samples of the same substance or matter, comprising optionally transforming, centering, and scaling the input data to provide a descriptor set and a concentration or property set. The method removes information or systematic variation in the input data that is not correlated to the concentration or property set by providing the steps of:

[0013] producing a descriptor weight set, which is normalized, by projecting the descriptor set on the concentration or property set; projecting the descriptor set on the descriptor weight set, producing a descriptor score set; projecting the descriptor set on the descriptor score set, producing a descriptor loading set; projecting the property set on the descriptor score set, producing a property weight set; projecting the property set on the property weight set, producing a property score set;

[0014] comparing the descriptor loading set and the descriptor weight set, and their difference, thus obtaining the part of the descriptor loading set that is unrelated to the property set;

[0015] using said difference weight set, normalized, as a starting set for partial least squares analysis;

[0016] calculating the corresponding orthogonal descriptor score set as the projection between the descriptor set and said normalized orthogonal difference weight set, and calculating a corresponding orthogonal descriptor loading set as the projection of the descriptor set onto the orthogonal descriptor score set;

[0017] removing the outer product of the orthogonal descriptor score set and the orthogonal descriptor loading set from the descriptor set, thus providing residuals data, which is provided as the descriptor set in a next component;

[0018] repeating the above steps for each orthogonal component;

[0019] the residuals data now being filtered from strong systematic variation that can be bilinearly modeled as the outer product of the orthogonal descriptor score set and the orthogonal descriptor loading set, thus providing an orthogonal descriptor set being orthogonal to the property set;

[0020] optionally providing a principal component analysis (PCA) on the orthogonal descriptor set, producing a bilinear decomposition of the orthogonal descriptor set as the outer product of the principal component analysis score set and the principal component analysis loading set plus principal component analysis residuals, and adding the principal component analysis residuals data back into the filtered residuals data.

[0021] For filtering of new data, the method proceeds with the following steps:

[0022] projecting a new descriptor set onto the normalized orthogonal difference weight set, thus producing a new orthogonal descriptor score set;

[0023] removing the product between the new orthogonal descriptor score set and the orthogonal descriptor loading set from the new descriptor set, thus providing new residuals, which are provided as a new descriptor set in a next orthogonal component.

[0024] The filtering steps for new data are repeated for all estimated orthogonal components as follows:

[0025] computing a new orthogonal descriptor set as the outer product of the new orthogonal descriptor score set and the orthogonal descriptor loading set, and computing a new orthogonal principal component score set from the projection of the new orthogonal descriptor set onto the principal component analysis loading set, whereby the new principal component analysis residuals formed are added back into the new residuals if principal component analysis was used on the orthogonal descriptor set and only the outer product of the principal component analysis score set and the principal component analysis loading set was removed from the original descriptor set.

[0026] For multiple concentration or property sets, a principal component analysis model is calculated on said property sets and the above steps are repeated for each separate principal component analysis score set, using the orthogonal descriptor set as the input descriptor set for each subsequent principal component analysis score set, thus making up a filtering method for filtering of further samples of the same type.

[0027] Further, an ordinary PLS analysis is performed with the filtered residuals data and the concentration or property set, and with said filtered new residuals set as prediction set.

[0028] In one embodiment of the present invention it is possible, by finding said orthogonal components for each component separately, to analyze the amount of disturbing variation in each partial least squares component.

[0029] Another embodiment uses crossvalidation and/or eigenvalue criteria for reducing overfitting.

[0030] A further embodiment comprises that principal component analysis (PCA) components are chosen according to a crossvalidation or eigenvalue criterion.

[0031] A still further embodiment comprises that it is designed to remove specific types of variation in the descriptor set, when an unwanted or non-relevant concentration or property set exists, by using the orthogonal descriptor set as a data set of interest, as it contains no variation correlated to the concentration or property set.

[0032] The present invention also sets forth an arrangement for concentration or property calibration of input data from samples of substances or matter, said calibration determining a filter model for further samples of the same substance or matter, comprising optionally transforming, centering, and scaling the input data to provide a descriptor set and a concentration or property set.

[0033] The filter model removes information or systematic variation in the input data that is not correlated to the concentration or property set by comprising:

[0034] projecting means for producing a descriptor weight set, which is normalized, by projecting the descriptor set on the concentration or property set;

[0035] projecting means for the descriptor set on the descriptor weight set, producing a descriptor score set;

[0036] projecting means for the descriptor set on the descriptor score set, producing a descriptor loading set;

[0037] projecting means for the property set on the descriptor score set, producing a property weight set;

[0038] projecting means for the property set on the property weight set, producing a property score set;

[0039] comparing means for the descriptor loading set and the descriptor weight set, and their difference, thus obtaining the part of the descriptor loading set that is unrelated to the property set;

[0040] using said difference weight set, normalized, as a starting set for partial least squares analysis;

[0041] calculating means for the corresponding orthogonal descriptor score set as the projection between the descriptor set and said normalized orthogonal difference weight set, and for calculating a corresponding orthogonal descriptor loading set as the projection of the descriptor set onto the orthogonal descriptor score set;

[0042] calculating means for removing the outer product of the orthogonal descriptor score set and the orthogonal descriptor loading set from the descriptor set, thus providing residuals data, which is provided as the descriptor set in a next component.

[0043] Repeatedly using the above means and steps for each orthogonal component, and further comprising:

[0044] filtering means for the residuals data from strong systematic variation that can be bilinearly modeled as the outer product of the orthogonal descriptor score set and the orthogonal descriptor loading set, thus providing an orthogonal descriptor set being orthogonal to the property set;

[0045] optionally providing analyzing means for a principal component analysis (PCA) on the orthogonal descriptor set, producing a bilinear decomposition of the orthogonal descriptor set as the outer product of the principal component analysis score set and the principal component analysis loading set plus principal component analysis residuals, adding the principal component analysis residuals data back into the filtered residuals data;

[0046] comprising filtering means for new data, and proceeding with the following means:

[0047] projecting means for a new descriptor set onto the normalized orthogonal difference weight set, thus producing a new orthogonal descriptor score set;

[0048] calculating means for removing the product between the new orthogonal descriptor score set and the orthogonal descriptor loading set from the new descriptor set, thus providing new residuals, which are provided as a new descriptor set in a next orthogonal component.

[0049] Repeatedly using said filtering of new data for all estimated orthogonal components, comprising:

[0050] computing means for a new orthogonal descriptor set as the outer product of the new orthogonal descriptor score set and the orthogonal descriptor loading set, and for computing a new orthogonal principal component score set from the projection of the new orthogonal descriptor set onto the principal component analysis loading set, whereby the new principal component analysis residuals are added back into the new residuals (e_new′) if principal component analysis was used on the orthogonal descriptor set and only the outer product of the principal component analysis score set and the principal component analysis loading set was removed from the original descriptor set.

[0051] For multiple concentration or property sets, calculating a principal component analysis model on said property sets and repeatedly using the above means for each separate principal component analysis score set, using the orthogonal descriptor set as the input descriptor set for each subsequent principal component analysis score set, thus making up a filtering method for filtering of further samples of the same type.

[0052] The arrangement further comprises partial least squares analysis means for the filtered residuals data and the concentration or property set, and for said filtered new residuals set as prediction set.

[0053] The arrangement is further capable of performing other embodiments of the method in accordance with the attached dependent claims.

BRIEF DESCRIPTION OF DRAWINGS

[0054] For a more complete understanding of the present invention and for further objectives and advantages thereof, reference may now be had to the following description taken in conjunction with the accompanying drawings, in which:

[0055] FIG. 1 illustrates an overview of orthogonal partial least squares (OPLS) in accordance with the present invention;

[0056] FIGS. 2a and 2b illustrate an example of the effect of OPLS, where an upper FIG. 2a illustrates column centered untreated NIR spectra, and a lower FIG. 2b shows OPLS treated NIR spectra in accordance with the present invention;

[0057] FIG. 3 illustrates the norm of an orthogonal vector;

[0058] FIG. 4 illustrates the explained variation for orthogonal PLS components;

[0059] FIG. 5 illustrates a first principal orthogonal loading of a disturbing variation in X;

[0060] FIG. 6 illustrates a second principal orthogonal loading of a disturbing variation in X;

[0061] FIG. 7 illustrates a score plot t1-t2 of orthogonal principal components;

[0062] FIG. 8 illustrates a first loading w1 from an original PLS model;

[0063] FIG. 9 illustrates a second loading w2 from an original PLS model;

[0064] FIG. 10 illustrates a t1-u1 score plot of an original PLS model;

[0065] FIG. 11 illustrates a t1-u1 score plot of an OPLS pretreated PLS model according to the present invention;

[0066] FIG. 12 illustrates a first loading w1 from an original PLS model;

[0067] FIG. 13 illustrates a first loading w1 from an OPLS pretreated PLS model according to the present invention;

[0068] FIG. 14 illustrates PLS regression coefficients; and

[0069] FIG. 15 illustrates OPLS pretreated PLS regression coefficients according to the present invention.

TABLE

[0070] Tables 1 and 2 referred to in the description are attached to it.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0071] The present invention provides for removing systematic information from an input data set X that is irrelevant for the concentration or property set y or Y; in other words, for removing variability in X that is orthogonal to Y. A pre-processing method with a similar concept as orthogonal signal correction (OSC) is disclosed through the present invention, named orthogonal partial least squares (OPLS).

[0072] Definitions

[0073] A vector, a matrix and the like are defined as being sets; for example, a set may be a vector, a matrix or the like.

[0074] Primed vectors or matrices are mathematically transposed.

[0075] A component in PCA or PLS represents a new latent variable produced from summarizing old variables by means of projection.

[0076] A loading set describes the orientation of an obtained component in relation to the original variables in a data matrix X.

[0077] The character y defines a column vector and Y depicts a matrix, i.e., several column vectors.

[0078] The proposed OPLS according to the present invention analyzes the disturbing variation in each PLS component. The disturbing variation in X is separated from the relevant variation, improving interpretation and analysis of the filtered input data X, with the additional bonus that the irrelevant variation itself can be studied and analyzed.

[0079] In an example given using near infrared reflectance (NIR) spectra on wood chips, applying OPLS as a preprocessing method resulted in reduced PLS model complexity with preserved prediction ability, effective removal of disturbing variation in data, and, not least, improved interpretational ability of both wanted and unwanted variation in data.

[0080] Removing irrelevant variation in data prior to data modeling is interesting not only from a predictive point of view; the interpretational ability of the resulting models also improves.

[0081] Interpretation of all provided models is very important. From interpretation, more information and knowledge of a system can be retrieved, analyzed, and developed further.

[0082] Multiplicative scatter correction (MSC) is a method developed by Geladi et al. The method was developed to assist in circumventing problems found in spectra from near infrared reflectance spectroscopy in multivariate calibration. Additive and multiplicative scatter effects produce variations in an NIR spectrum that are difficult to cope with in calibration models. The MSC method calculates parameters a and b through regression of each of the spectra onto a target spectrum, usually the mean spectrum x_(m):

x_(i) = a_(i) + b_(i)x_(m) + e_(i)

The parameters a, b are further used to update the NIR spectrum using the following formula:

MSC correction filter: x_(i,corr) = (x_(i) − a_(i))/b_(i)

[0083] A major problem with MSC is that the parameters a, b are determined regardless of Y. In multivariate calibration, this means that MSC can actually remove variation in X that is relevant for the modeling of y or Y, thereby producing worse calibration models and increased prediction residuals. The standard normal variate transform (SNV) developed by Barnes is similar to MSC. Here, the updating parameters a and b are calculated from each spectrum individually: a_(i) represents the mean of spectrum x_(i), and b_(i) is the standard deviation of row x_(i). SNV is analogous to unit variance (UV) scaling and centering of each row:

SNV: x_(i,corr) = (x_(i) − a_(i))/b_(i), where a_(i) = mean of row i, and b_(i) = √(Σ(x_(i) − a_(i))²/d.f.) on row i.

[0084] The SNV parameters are also determined regardless of Y, which could result in worse calibration models and higher prediction residuals.
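As an illustration only, the MSC and SNV corrections described above may be sketched as follows in numpy; the function names msc and snv and the least-squares estimation of a_(i) and b_(i) via polyfit are illustrative choices, not part of the claimed method.

import numpy as np

def msc(X, target=None):
    # Multiplicative scatter correction: regress each spectrum (row of X)
    # onto a target spectrum, by default the mean spectrum x_m, and
    # correct it as x_corr = (x - a)/b, per the formulas above.
    if target is None:
        target = X.mean(axis=0)
    X_corr = np.empty_like(X, dtype=float)
    for i, x in enumerate(X):
        b, a = np.polyfit(target, x, deg=1)   # least-squares fit x ~ a + b*x_m
        X_corr[i] = (x - a) / b
    return X_corr

def snv(X, ddof=1):
    # Standard normal variate: center each row by its mean a_i and divide
    # by its standard deviation b_i (UV scaling of each row).
    a = X.mean(axis=1, keepdims=True)
    b = X.std(axis=1, ddof=ddof, keepdims=True)
    return (X - a) / b

Note that neither function refers to y or Y, which is precisely the weakness of these corrections discussed above.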

[0085] Andersson recently reported another preprocessing method called direct orthogonalization (DO), C. A. Andersson, Direct Orthogonalization, Chemometrics and Intelligent Laboratory Systems, 47 (1999) 51-63. It also attempts to remove variation in X that is orthogonal to Y. Unfortunately, this method fails because it does not, contrary to what the name implies, guarantee that orthogonal information is being removed. Therefore DO cannot be classified as an orthogonal filtering method.

[0086] The steps in the direct orthogonalization method are shown below:

[0087] 1.) X and Y are centered.

[0088] 2.) w=X′y(y′y)⁻¹, project X on Y

[0089] 3.) X_ortho=X−Yw′, Orthogonalize X to Y

[0090] 4.) X_ortho=T_orthoP_ortho′+E, decompose X_ortho into principal components, keep the loadings P_ortho

[0091] 5.) T=XP_ortho, calculate new scores T from the original X and the orthogonal loadings P_ortho

[0092] 6.) X_(do)=X−TP′_ortho

[0093] 7.) Calculate calibration model using X_(do) and Y

[0094] 8.) For new column centered samples, T_(pred)=X_(pred)P_ortho

[0095] 9.) X_do_pred=X_pred−T_predP_ortho′
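For reference, the DO steps above can be sketched in numpy as follows; the function name is illustrative, and the sketch deliberately reproduces step 5 as published, which is the flaw discussed next.

import numpy as np

def direct_orthogonalization(X, Y, n_components):
    # Sketch of Andersson's DO, steps 1-6 above; X (N, K) and Y (N, M)
    # are assumed column centered. Note that step 5 recomputes scores
    # from the ORIGINAL X, so the variation removed in step 6 is not
    # guaranteed to be orthogonal to Y.
    W = X.T @ Y @ np.linalg.inv(Y.T @ Y)          # step 2: project X on Y
    X_ortho = X - Y @ W.T                         # step 3: 'orthogonalize' X to Y
    U, s, Vt = np.linalg.svd(X_ortho, full_matrices=False)
    P_ortho = Vt[:n_components].T                 # step 4: PCA loadings of X_ortho
    T = X @ P_ortho                               # step 5: scores from original X
    X_do = X - T @ P_ortho.T                      # step 6: 'filtered' data
    return X_do, P_ortho

# Steps 8-9, prediction set (column centered):
# X_do_pred = X_pred - (X_pred @ P_ortho) @ P_ortho.T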

[0096] There are some major problems with the suggested approach, the crucial one being step 5. Instead of using the orthogonal scores T_ortho from step 4, Andersson calculates a new set of scores T with the loadings P_ortho from the original X matrix. These new T scores are not orthogonal to Y, which means that variation in data relevant for the modeling of Y can be removed in step 6, resulting in worse calibration models and higher prediction errors. The same situation occurs for new unknown samples to be predicted in step 8. If the method had used the orthogonal scores calculated in step 4, the variation removed from X would have been orthogonal. However, the problem would then be the updating of new samples, because no Y exists for them. This problem was earlier mentioned by Wold.

[0097] The orthogonal signal correction (OSC) method introduced by Wold et al. represented a new exciting concept in multivariate data analysis and modeling. The idea behind it is to remove information in X that is irrelevant for the modeling of Y. This is accomplished by removing information or variability in X that is non-correlated or orthogonal to Y.

[0098] There are three criteria put on the OSC solution:

[0099] Should involve large systematic variance in X

[0100] Must be predictive by X (in order to apply to future data)

[0101] Removed information from X must be orthogonal to Y

[0102] The first two criteria are easily met; a regular PCA solution provides that. However, the third and most important criterion is not easily met. It requires a time-consuming iteration to find an OSC solution that satisfies all three criteria simultaneously.

[0103] A solution often converges quickly, but it still needs 10-20 iterations. The OSC solution is not unique but depends on a starting vector t. Therefore, PCA is a good choice to produce the starting vector, because it gives the longest t vector that can be predicted by X. Two of the criteria above are then automatically met. During an OSC iteration, the length of the t vector will decrease somewhat in order to converge the OSC solution.

Outline of OSC:

(1) Optionally transform, center and scale the data to give the 'raw' matrices X and Y.

(2) Start by calculating the first principal component of X, with the score vector t.

(3) Orthogonalize t to Y: t_new=(I−Y(Y′Y)⁻¹Y′)t.

(4) Calculate a normalized weight vector w that makes Xw=t_new. This is done by a PLS estimation giving a generalized inverse X⁻: w=X⁻t_new.

(5) Calculate a new score vector from X and w: t=Xw.

(6) Check for convergence by testing if t has stabilized: convergence if ‖t−t_new‖/‖t‖ < 10⁻⁶; if not converged, return to step 3, otherwise continue to step 7.

(7) Compute a loading vector p (needed for orthogonality between components): p′=t′X/(t′t).

(8) Subtract the 'correction' tp′ from X to give the residuals E.

(9) Continue with the next 'component' using E as X, then another, etc., until satisfaction.

(10) New samples (the prediction set) are corrected using W and P of the calibration model. For each new observation vector x_new′: t₁=x_new′w₁, e₁′=x_new′−t₁p₁′; t₂=e₁′w₂, e₂′=e₁′−t₂p₂′, and so on.
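As an illustration, one OSC component per the outline above may be sketched in numpy as follows. The PLS-estimated generalized inverse of step 4 is approximated here by ordinary least squares, which is a simplifying assumption of this sketch; the function name is illustrative.

import numpy as np

def osc_component(X, Y, tol=1e-6, max_iter=100):
    # One OSC 'component'; X (N, K) and Y (N, M) are assumed centered.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    t = U[:, 0] * s[0]                              # step 2: first PC score of X
    M = np.eye(X.shape[0]) - Y @ np.linalg.inv(Y.T @ Y) @ Y.T
    for _ in range(max_iter):
        t_new = M @ t                               # step 3: orthogonalize t to Y
        w = np.linalg.lstsq(X, t_new, rcond=None)[0]  # step 4: solve Xw ~ t_new
        w /= np.linalg.norm(w)
        t = X @ w                                   # step 5: new score vector
        if np.linalg.norm(t - t_new) / np.linalg.norm(t) < tol:
            break                                   # step 6: converged
    p = X.T @ t / (t @ t)                           # step 7: loading vector
    E = X - np.outer(t, p)                          # step 8: residuals
    return E, w, p

Further components are obtained by repeating the function with E as X (step 9), and new samples are corrected with the stored w and p vectors (step 10).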

[0104] The main problem with OSC has concerned the overfitting of the orthogonal components. Crossvalidation or any other validation method is not used in OSC; the additional time needed for this has been one of the obstacles. The correct number of "internal PLS components" to estimate the orthogonal components is therefore difficult to determine, leading to overfitting and sometimes degradation of the resulting calibration models. The OSC method is also quite computer intensive for larger data sets (K>2000), due to the iteration to estimate the orthogonal components.

[0105] The orthogonal partial least squares (OPLS) method according to the present invention has a similar criterion as OSC, namely to remove variation in X irrelevant to y or Y.

Outline of the OPLS method for a single property: optionally transform, center and scale the data to give the 'raw' matrices X and y.

1. w′=y′X/(y′y): project X on y.

2. w=w/‖w‖: normalize w.

3. t=Xw/(w′w): project X on w.

4. c′=t′y/(t′t): project y on t.

5. u=yc/(c′c): project y on c.

6. p′=t′X/(t′t): project X on t.

7. w_ortho=p−[(w′p)/(w′w)]w: find the orthogonal loading in p.

8. w_ortho=w_ortho/‖w_ortho‖: normalize the orthogonal loading w_ortho.

9. t_ortho=Xw_ortho/(w_ortho′w_ortho): project X on w_ortho.

10. p_ortho′=t_ortho′X/(t_ortho′t_ortho): project X on t_ortho.

11. E_OPLS=X−t_ortho p_ortho′: remove the orthogonal variation from X.

12. T_ortho=[T_ortho t_ortho], P_ortho=[P_ortho p_ortho], W_ortho=[W_ortho w_ortho]: save the found parameters. Return to step 1 and set X=E_OPLS for additional orthogonal components; otherwise continue to step 13.

13. X_ortho=T_ortho P_ortho′: the orthogonal variation in X. Analyze the variation component-wise, or run PCA on X_ortho (step 14).

14. X_ortho=T_pca_ortho P_pca_ortho′+E_pca_ortho: principal component analysis (PCA) on X_ortho to summarize the latent orthogonal variation. Removing all estimated orthogonal variation from X is one option. Another option is to only remove the latent orthogonal components estimated from the PCA in step 14; this corresponds to adding E_pca_ortho back into E_OPLS.

15. New or future samples (the prediction set) are corrected using W_ortho and P_ortho of the calibration model. For each new observation vector x_new′, repeat steps 16-18 for each orthogonal component estimated in the calibration model.

16. t_new_ortho=x_new′w_ortho/(w_ortho′w_ortho): calculate the orthogonal score in x_new′.

17. t_new_ortho′=[t_new_ortho′ t_new_ortho]: save the orthogonal scores for the prediction set.

18. e_new_OPLS′=x_new′−t_new_ortho p_ortho′: the orthogonal component in x_new′ is removed. Set x_new′=e_new_OPLS′. Proceed to step 19 when all orthogonal components have been removed.

19. x_new_ortho′=t_new_ortho′P_ortho′: the orthogonal variation in x_new′.

20. t_new_pca_ortho′=x_new_ortho′P_pca_ortho: estimate new scores from the PCA in step 14.

21. x_new_ortho′=t_new_pca_ortho′P_pca_ortho′+e_new_pca_ortho′: if only the orthogonal latent components from the PCA on X_ortho were removed, then e_new_pca_ortho′ should be added back into e_new_OPLS′.
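As an illustration of steps 1-12 and 16-18 above, a minimal numpy sketch of the single-property OPLS filter follows. Steps 4-5 (c and u) and the optional PCA steps 13-14 and 19-21 are omitted, since they are not needed for the filtering itself; all names are illustrative.

import numpy as np

def opls_filter(X, y, n_ortho):
    # Single-y OPLS filtering, steps 1-12 of the outline above.
    # X (N, K) and y (N,) are assumed centered.
    E = X.astype(float).copy()
    W_ortho, P_ortho = [], []
    for _ in range(n_ortho):
        w = E.T @ y / (y @ y)            # step 1: project X on y
        w /= np.linalg.norm(w)           # step 2: normalize w
        t = E @ w                        # step 3: project X on w (w'w = 1)
        p = E.T @ t / (t @ t)            # step 6: project X on t
        w_o = p - (w @ p) * w            # step 7: orthogonal part of p
        w_o /= np.linalg.norm(w_o)       # step 8: normalize w_ortho
        t_o = E @ w_o                    # step 9: orthogonal score set
        p_o = E.T @ t_o / (t_o @ t_o)    # step 10: orthogonal loading set
        E -= np.outer(t_o, p_o)          # step 11: remove orthogonal variation
        W_ortho.append(w_o)              # step 12: save parameters
        P_ortho.append(p_o)
    return E, np.array(W_ortho).T, np.array(P_ortho).T

def opls_filter_new(X_new, W_ortho, P_ortho):
    # Correction of the prediction set, steps 16-18 of the outline above.
    E = X_new.astype(float).copy()
    for a in range(W_ortho.shape[1]):
        t_o = E @ W_ortho[:, a]                 # step 16: orthogonal score
        E -= np.outer(t_o, P_ortho[:, a])       # step 18: remove component
    return E

The filtered calibration data E_OPLS and y are then modeled with an ordinary PLS analysis, and the filtered new residuals are used as prediction set, as described above.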

[0106] Outline of the OPLS Method for Multiple Properties

[0107] The outline of the OPLS method is shown here for a matrix Y with multiple properties. An example of multiple properties could be temperature and moisture content, or any other relevant multiple property. Optionally transform, center and scale the raw data to give the matrices X and Y.

1. w′=y′X/(y′y): for each column y in Y, estimate the corresponding w and create a matrix W=[W w].

2. W=T_w P_w′+E_w: estimate with principal component analysis (PCA) the principal components of W, as long as the ratio of the sum of squares of the current score vector t_w divided by the sum of squares of W is larger than a given threshold, typically 10⁻¹⁰.

3. Estimate a regular multi-Y PLS component with the given X and Y (steps 4-9).

4. Initialize the multi-Y PLS calculation by setting a column in Y to u.

5. w′=u′X/(u′u): repeat steps 5-9 until convergence.

6. w=w/‖w‖.

7. t=Xw/(w′w).

8. c′=t′Y/(t′t).

9. u=Yc/(c′c). Check convergence: if ‖u_new−u_old‖/‖u_new‖>10⁻¹⁰, return to step 5, otherwise continue to step 10.

10. p′=t′X/(t′t). To estimate an orthogonal component, go to step 11; otherwise go to step 17.

11. p=p−[(t_w′p)/(t_w′t_w)]t_w: orthogonalize p to each column t_w in T_w, then set w_ortho=p. In this way, orthogonality to all Y variables is ensured for the resulting orthogonal score vector in step 13. The p vector in this step can also be an arbitrary vector (e.g., a PCA loading of X). A sketch of this orthogonalization is given after this outline.

12. w_ortho=w_ortho/‖w_ortho‖.

13. t_ortho=Xw_ortho/(w_ortho′w_ortho).

14. p_ortho′=t_ortho′X/(t_ortho′t_ortho).

15. E_OPLS=X−t_ortho p_ortho′: E_OPLS are the filtered data.

16. Save the found parameters T_ortho=[T_ortho t_ortho], P_ortho=[P_ortho p_ortho], W_ortho=[W_ortho w_ortho]. Return to step 4 and set X=E_OPLS.

17. To find the orthogonal variation for the next PLS component, remove the current PLS component from X and Y and save the parameters of this PLS component for future samples: E=X−tp′, F=Y−tc′, T_pls=[T_pls t], W_pls=[W_pls w], P_pls=[P_pls p]; return to step 1 and set X=E and Y=F. Otherwise, to stop, go to step 18.

18. X_ortho=T_ortho P_ortho′: analyze the orthogonal variation component-wise, or run PCA on X_ortho (step 19).

19. X_ortho=T_pca_ortho P_pca_ortho′+E_pca_ortho: principal component analysis (PCA) of X_ortho to summarize the systematic orthogonal variation. Removing all estimated orthogonal variation from X is one option; another option is to only remove the principal orthogonal components estimated in step 19, which corresponds to adding E_pca_ortho back into E_OPLS.

20. E_OPLS=E_OPLS+T_pls P_pls′: add the removed PLS components back into E_OPLS, which now contains the filtered data.

21. New or future samples (the prediction set) are corrected using W_ortho, P_ortho, W_pls and P_pls from the calibration model. For each new observation vector x_new′, repeat steps 22-26 for each component (OPLS or PLS) in the order they were calculated in the calibration model.

22. If component=OPLS: t_new_ortho=x_new′w_ortho/(w_ortho′w_ortho).

23. If component=OPLS: t_new_ortho′=[t_new_ortho′ t_new_ortho]: save the orthogonal scores for the prediction set. The first t in the brackets is a vector, while the second t is a scalar.

24. If component=OPLS: e_new_OPLS′=x_new′−t_new_ortho p_ortho′: the orthogonal component in x_new′ is removed. Set x_new′=e_new_OPLS′ for additional components and return to step 21; otherwise proceed to step 27.

25. If component=PLS: t_new_pls=x_new′w_pls/(w_pls′w_pls), t_new_pls′=[t_new_pls′ t_new_pls]. The first t in the brackets is a vector, while the second t is a scalar.

26. If component=PLS: e_new_pls′=x_new′−t_new_pls p_pls′: the PLS component in x_new′ is removed. Set x_new′=e_new_pls′ and return to step 22.

27. x_new_ortho′=t_new_ortho′P_ortho′.

28. t_new_pca_ortho′=x_new_ortho′P_pca_ortho: estimate new scores from the PCA loadings in step 19. x_new_ortho′=t_new_pca_ortho′P_pca_ortho′+e_new_pca_ortho′: if only the orthogonal latent components from the PCA on X_ortho were removed, then e_new_pca_ortho′ should be added back into e_new_OPLS′.

29. e_new_OPLS′=e_new_OPLS′+t_new_pls′P_pls′: add the PLS components back into e_new_OPLS′, which contains the filtered data.

[0108] For multiple Y, run principal component analysis (PCA) on Y and repeat the method above for each separate Y score. Use X_ortho as the input X matrix after the first round. Then orthogonality to all Y variables is guaranteed.

[0109] In the area of semi-empirical modeling, the obvious advantages with OPLS are a more parsimonious PLS model (fewer components) and improved interpretation, because the disturbing variation and the relevant variation have been separated. OPLS should also give an improved detection limit for moderate outliers in the scores, because irrelevant variation in X could have a different statistical distribution than the relevant variation, producing a disturbance in the calculation of, for example, the Hotelling's T² statistic.

[0110] Another advantage with OPLS compared to the earlier proposed OSC method is that no time-consuming internal iteration is present, making it very fast to calculate. Also, the risk of overfitting is greatly reduced with OPLS, because crossvalidation and/or some eigenvalue criterion is used, resulting in systematic and relevant components being calculated and extracted. OPLS is a modification of the original PLS NIPALS method for effectively separating relevant and irrelevant variation in order to improve the interpretation of data. The number of orthogonal components should be selected according to a significance criterion. Used herein is a combination of looking at how much orthogonal variation is removed for each component and the normalized difference between p and w_ortho. A regular crossvalidation with Y is not possible, because the c weight parameter is always zero for all orthogonal components.

[0111] FIG. 1 provides an overview of the OPLS method according to the present invention. The overview is commented in blocks of text in FIG. 1, where the advantages and disadvantages of OPLS pretreated PLS models versus PLS models without pretreatment are stated. A regular PLS model is thus harder to interpret, more components are needed, and X contains irrelevant information. The OPLS pretreated PLS model has the advantages of being easier to interpret, the resulting PLS model is more parsimonious, plots of parameters such as scores and loadings are more relevant, and predictions can be improved.

[0112] Principal component analysis (PCA) can be used to decompose the already orthogonal matrix X_ortho into orthogonal principal components. The number of PCA components can be chosen according to some significance criterion, e.g., crossvalidation or an eigenvalue criterion. Analyzing the irrelevant variation is most valuable; the source of the disturbing variation can perhaps be identified and removed, or at least its origin understood.

[0113] It is important to realize that all variation in these score and loading plots is disturbance and can be interpreted without regard to the influence on Y. The information from such an analysis is very important, not least for industrial process data, which contain large unknown variations due to fluctuating process environments that are hard to remove, but awareness of what they are could be vital for further process improvements. Also, instead of removing the orthogonal PLS components from the original data, another suggested approach that works well is to run PCA on the orthogonal data matrix and only remove the principal components from the orthogonal data matrix. The residual left is inserted back into the OPLS treated X matrix. This has been shown to improve the predictions and to decrease the total number of components used drastically. The results of such an analysis are shown in attached Table 1.

[0114] According to the present invention there also exists another useful method to estimate and remove irrelevant variation from X with respect to a given Y, and that method converges with the OPLS solution if PCA is used on the orthogonal data matrix X_ortho as suggested in the OPLS method. This method is herein named projected orthogonal signal correction, POSC. It requires an initial fully estimated PLS model to be calculated, and it cannot extract an orthogonal PLS component for each PLS component as OPLS is able to.

[0115] The method of the present invention can thus be summarized as a method for concentration or property calibration of input data from samples of substances or matter, said calibration determining a filter model for further samples of the same substance or matter, comprising optionally transforming, centering, and scaling the input data to provide a descriptor set X and a concentration or property set y, Y. It removes information or systematic variation in the input data that is not correlated to the concentration or property set by providing the steps of:

[0116] producing a descriptor weight set w, which is normalized, by projecting the descriptor set X on the concentration or property set y, Y; projecting the descriptor set X on the descriptor weight set w, producing a descriptor score set t; projecting the descriptor set X on the descriptor score set t, producing a descriptor loading set p; projecting the property set y on the descriptor score set t, producing a property weight set c; projecting the property set y on the property weight set c, producing a property score set u;

[0117] comparing the descriptor loading set p and the descriptor weight set w, and their difference p−w, thus obtaining the part of the descriptor loading set p that is unrelated to the property set y;

[0118] using said difference weight set w_ortho, normalized, as a starting set for partial least squares analysis;

[0119] calculating the corresponding orthogonal descriptor score set t_ortho as the projection between the descriptor set X and said normalized orthogonal difference weight set w_ortho, and calculating a corresponding orthogonal descriptor loading set p_ortho as the projection of the descriptor set X onto the orthogonal descriptor score set t_ortho;

[0120] removing the outer product of the orthogonal descriptor score set t_ortho and the orthogonal descriptor loading set p_ortho′ from the descriptor set X, thus providing residuals data E, which is provided as the descriptor set X in a next component;

[0121] repeating the above steps for each orthogonal component;

[0122] the residuals data E now being filtered from strong systematic variation that can be bilinearly modeled as the outer product of the orthogonal descriptor score set and the orthogonal descriptor loading set, T_ortho*P_ortho′, thus providing an orthogonal descriptor set X_ortho being orthogonal to the property set y, Y.

[0123] Optionally, a principal component analysis (PCA) can be provided on the orthogonal descriptor set X_ortho, producing a bilinear decomposition of the orthogonal descriptor set X_ortho as the outer product of the principal component analysis score set and the principal component analysis loading set plus the principal component analysis residuals, T_pca_ortho*P_pca_ortho′+E_pca_ortho, whereby the principal component analysis residuals data E_pca_ortho can be added back into the filtered residuals data E.

[0124] For filtering new data, the following steps are performed:

[0125] projecting a new descriptor set x_new′ onto the normalized orthogonal difference weight set w_ortho, thus producing a new orthogonal descriptor score set t_new_ortho;

[0126] removing the product between the new orthogonal descriptor score set t_new_ortho and the orthogonal descriptor loading set p_ortho′ from the new descriptor set x_new′, thus providing new residuals e_new′, which are provided as a new descriptor set x_new′ in a next orthogonal component.

[0127] The filtering steps are repeated for new data for all estimated orthogonal components; and

[0128] computing a new orthogonal descriptor set x_new_ortho′=t_new_ortho*P_ortho′ as the outer product of the new orthogonal descriptor score set t_new_ortho and the orthogonal descriptor loading set p_ortho′, and computing a new orthogonal principal component score set t_new_pca_ortho from the projection of the new orthogonal descriptor set onto the principal component analysis loading set, x_new_ortho′*P_pca_ortho, whereby the new principal component analysis residuals formed, e_new_pca_ortho=x_new_ortho′−t_new_pca_ortho*P_pca_ortho′, are added back into the new residuals e_new′ if principal component analysis was used on the orthogonal descriptor set X_ortho and only the outer product of the principal component analysis score set and the principal component analysis loading set, T_pca_ortho*P_pca_ortho′, was removed from the original descriptor set X.

[0129] For multiple concentration or property sets Y, calculating a principal component analysis model on said property sets, Y=TP′+E, and repeating the above steps for each separate principal component analysis score set t, using the orthogonal descriptor set X_ortho as the input descriptor set X for each subsequent principal component analysis score set t, thus making up a filtering method for filtering of further samples of the same type.

[0130] Proceeding with performing an ordinary PLS analysis with the filtered residuals data E and the concentration or property set y, Y, and with said filtered new residuals set e_new′ as prediction set.

[0131] The present invention also sets forth an arrangement for concentration or property calibration of input data from samples of substances or matter.

[0132] A filter model comprised in the arrangement removes information or systematic variation in the input data that is not correlated to the concentration or property set by comprising:

[0133] projecting means for producing a descriptor weight set w, which is normalized, by projecting the descriptor set X on the concentration or property set y, Y;

[0134] projecting means for the descriptor set X on the descriptor weight set w, producing a descriptor score set t;

[0135] projecting means for the descriptor set X on the descriptor score set t, producing a descriptor loading set p;

[0136] projecting means for the property set y on the descriptor score set t, producing a property weight set c;

[0137] projecting means for the property set y on the property weight set c, producing a property score set u;

[0138] comparing means for the descriptor loading set p and the descriptor weight set w, and their difference p−w, thus obtaining the part of the descriptor loading set p that is unrelated to the property set y, Y;

[0139] using said difference weight set w_ortho, normalized, as a starting set for partial least squares analysis;

[0140] calculating means for the corresponding orthogonal descriptor score set t_ortho as the projection between the descriptor set X and said normalized orthogonal difference weight set w_ortho, and for calculating a corresponding orthogonal descriptor loading set p_ortho as the projection of the descriptor set X onto the orthogonal descriptor score set t_ortho;

[0141] calculating means for removing the outer product of the orthogonal descriptor score set t_ortho and the orthogonal descriptor loading set p_ortho′ from the descriptor set X, thus providing residuals data E, which is provided as the descriptor set X in a next component.

[0142] Repeatedly using the above means and steps for each orthogonal component;

[0143] filtering means for the residuals data E from strong systematic variation that can be bilinearly modeled as the outer product of the orthogonal descriptor score set and the orthogonal descriptor loading set, T_ortho*P_ortho′, thus providing an orthogonal descriptor set X_ortho being orthogonal to the property set y, Y;

[0144] The arrangement optionally provides analyzing means for a principal component analysis (PCA) on the orthogonal descriptor set X_ortho, producing a bilinear decomposition of the orthogonal descriptor set X_ortho as the outer product of the principal component analysis score set and the principal component analysis loading set plus the principal component analysis residuals, T_pca_ortho*P_pca_ortho′+E_pca_ortho, adding the principal component analysis residuals data E_pca_ortho back into the filtered residuals data E;

[0145] Filtering means for new data proceed with the following means:

[0146] projecting means for a new descriptor set x_new′ onto the normalized orthogonal difference weight set w_ortho, thus producing a new orthogonal descriptor score set t_new_ortho;

[0147] calculating means for removing the product between the new orthogonal descriptor score set t_new_ortho and the orthogonal descriptor loading set p_ortho′ from the new descriptor set x_new′, thus providing new residuals e_new′, which are provided as a new descriptor set x_new′ in a next orthogonal component;

[0148] Repeatedly using said filtering of new data for all estimated orthogonal components;

[0149] computing means for a new orthogonal descriptor set x_new_ortho′=t_new_ortho*P_ortho′ as the outer product of the new orthogonal descriptor score set t_new_ortho and the orthogonal descriptor loading set p_ortho′, and for computing a new orthogonal principal component score set t_new_pca_ortho from the projection of the new orthogonal descriptor set onto the principal component analysis loading set, x_new_ortho′*P_pca_ortho, whereby the new principal component analysis residuals formed, e_new_pca_ortho=x_new_ortho′−t_new_pca_ortho*P_pca_ortho′, are added back into the new residuals e_new′ if principal component analysis was used on the orthogonal descriptor set X_ortho and only the outer product of the principal component analysis score set and the principal component analysis loading set, T_pca_ortho*P_pca_ortho′, was removed from the original descriptor set X.

[0150] For multiple concentration or property sets Y, calculating a principal component analysis model on said property sets, Y=TP′+E, and repeatedly using the above means and steps for each separate principal component analysis score set t, using the orthogonal descriptor set X_ortho as the input descriptor set X for each subsequent principal component analysis score set t, thus making up a filtering method for filtering of further samples of the same type.

[0151] Applying partial least squares analysis means for the filtered residuals data E and the concentration or property set y, Y, and for said filtered new residuals set e_new′ as a prediction set.

[0152] It is to be understood that the means making up the arrangement can be purely software means, hardware means known in the art, or combinations of them.

Outline of the POSC method for a single property:

1.) Optionally transform, center and scale the data to give the 'raw' matrices X and y.

2.) t=Xw: calculate a normalized weight w from some regression method to estimate t, representing the best systematic correlation to y.

3.) p′=t′X/(t′t): project X on t to get the loading p.

4.) X_ortho=X−tp′.

5.) X_ortho=T_ortho P_ortho′+E_ortho: calculate a regular PCA model with a chosen number of components.

6.) X_posc=X−T_ortho P_ortho′: the filtered data X_posc.

7.) New or future data (the prediction set) are corrected using w, p and P_ortho from the calibration model. For each new observation vector x_test′, perform steps 8-15, repeating steps 11-14 for each orthogonal component estimated in the calibration model.

8.) t_test=x_test′w: calculate the score in x_test′.

9.) x_test_ortho′=x_test′−t_test p′: the orthogonal variation in the new data.

10.) Repeat steps 11-14 for each orthogonal component removed in step 5.

11.) t_test_ortho=x_test_ortho′p_ortho.

12.) t_test_ortho′=[t_test_ortho′ t_test_ortho]: save the orthogonal scores for the prediction set.

13.) e_test_ortho′=x_test_ortho′−t_test_ortho p_ortho′.

14.) For each remaining orthogonal component, set x_test_ortho′=e_test_ortho′ and return to step 11, else proceed to step 15.

15.) x_posc_test′=x_test′−t_test_ortho′P_ortho′: the filtered new data x_posc_test′.
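As an illustration, the single-property POSC outline above may be sketched in numpy as follows. The normalized weight w is taken here from the first PLS component (w proportional to X′y), one possible choice of the "some regression method" named in step 2; the function names are illustrative.

import numpy as np

def posc(X, y, n_ortho):
    # Sketch of POSC steps 1-6 above; X and y are assumed centered.
    w = X.T @ y
    w /= np.linalg.norm(w)
    t = X @ w                                     # step 2: scores t
    p = X.T @ t / (t @ t)                         # step 3: loading p
    X_ortho = X - np.outer(t, p)                  # step 4
    U, s, Vt = np.linalg.svd(X_ortho, full_matrices=False)
    P_ortho = Vt[:n_ortho].T                      # step 5: PCA loadings
    X_posc = X - (X_ortho @ P_ortho) @ P_ortho.T  # step 6: filtered data
    return X_posc, w, p, P_ortho

def posc_filter_new(X_new, w, p, P_ortho):
    # Steps 8-15 above for the prediction set. Because the PCA loadings
    # are orthonormal, the component-wise loop of steps 11-14 reduces to
    # a single projection.
    t = X_new @ w                                 # step 8
    X_ortho = X_new - np.outer(t, p)              # step 9
    return X_new - (X_ortho @ P_ortho) @ P_ortho.T  # step 15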

[0153] Outline of the POSC Method for Multiple Properties

[0154] The outline of the proposed POSC method is shown here for the matrices X and Y, where Y has multiple properties. An example of multiple properties could be temperature and moisture content, or any other relevant multiple property. Optionally transform, center and scale the raw data to give the matrices X and Y.

1. T=XW: calculate the normalized regression coefficients W from some regression method (e.g. PLS) to estimate T, representing the best systematic correlation to Y.

2. T=T_pca P_pca′+E_pca: estimate with principal component analysis (PCA) the principal components of T, as long as the ratio of the sum of squares of the current score vector t_pca divided by the sum of squares of T is larger than a given threshold, typically 10⁻¹⁰.

3. p′=t_pca′X/(t_pca′t_pca): estimate p for each column in T_pca, resulting in the matrix P.

4. X_ortho=X−T_pca P′.

5. X_ortho=T_ortho P_ortho′+E_ortho: calculate a PCA model.

6. X_posc=X−T_ortho P_ortho′: remove the systematically irrelevant variation.

7. New or future data (the prediction set) are corrected using W, P_pca, P and P_ortho from the calibration model. For each new observation vector x_test′, perform steps 8-15, repeating steps 12-14 for each orthogonal component estimated in the calibration model.

8. t_test′=x_test′W.

9. t_test_pca′=t_test′P_pca.

10. x_test_ortho′=x_test′−t_test_pca′P′: the orthogonal variation in the new data.

11. Repeat steps 12-14 for each orthogonal principal component removed in step 6.

12. t_test_ortho=x_test_ortho′p_ortho.

13. t_test_ortho′=[t_test_ortho′ t_test_ortho]: save t_test_ortho; the first t in the brackets is a vector, while the second t is a scalar.

14. e_test_ortho′=x_test_ortho′−t_test_ortho p_ortho′. For each remaining orthogonal component, set x_test_ortho′=e_test_ortho′ and return to step 12 to remove any additional orthogonal component earlier estimated, else proceed to step 15.

15. x_posc_test′=x_test′−t_test_ortho′P_ortho′: the filtered new data x_posc_test′.

[0155] All projection methods working after some least squares methodology are sensitive to abnormal occurrences in data, and PLS and PCA are no different. It is important to realize that detection and investigation of abnormal data or outliers represent an important part of multivariate data analysis and semi-empirical modeling. In PLS and PCA, abnormal samples of data can be detected and analyzed by looking at scores and residuals. Outlier detection in OPLS presents no additional problem, because the same principles apply.

[0156] The first steps in OPLS are to estimate the loadings w and p. OPLS calculates w and p with the generalized regression method projections to latent structures by means of partial least squares (PLS). The number of orthogonal and non-orthogonal components used should be selected according to some significance criterion; the well known crossvalidation technique can be used. It is important to use some significance criterion when determining the number of PLS components. Underfitting or overfitting of data is serious for any empirical modeling, and consequently also for OPLS.

[0157] The OPLS model and the original PLS model will give similar results regarding explained variance in Y unless the OPLS treated X matrix is scaled prior to PLS modeling. Improvements in prediction in the resulting OPLS model compared to the original PLS model can occur if the OPLS treated data matrix X is scaled prior to PLS modeling. Scaling methods such as unit variance (UV) scaling, where each column is divided by the standard deviation of that column, or pareto scaling, where the weight factor is the square root of the standard deviation of that column, are recommended; both are sketched below. It is important to realize that the orthogonal variation in X removed from the calibration samples is assumed to exist in future samples as well. This assumption stems from the fact that the updating parameters W_ortho and P_ortho are estimated from the calibration data only. No variation present only in future samples, or only in calibration samples, should be removed; this stresses the importance of only removing systematically relevant orthogonal components from the data. Removing that systematic irrelevant variation, and then applying scaling on the OPLS treated data, can improve predictions.
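The two scaling methods recommended above may be sketched as follows; the function names and the ddof parameter (degrees of freedom for the standard deviation) are illustrative.

import numpy as np

def uv_scale(X, ddof=1):
    # Unit variance (UV) scaling: each column is divided by its
    # standard deviation, as described above.
    return X / X.std(axis=0, ddof=ddof, keepdims=True)

def pareto_scale(X, ddof=1):
    # Pareto scaling: the weight factor is the square root of the
    # standard deviation of each column.
    return X / np.sqrt(X.std(axis=0, ddof=ddof, keepdims=True))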

[0158] The suggested OPLS method is truly versatile, and if properly used, OPLS will improve data modeling and interpretation regardless of most types of data properties. Suppose that a data set only contains relevant variation; consequently, the OPLS method will not find any orthogonal components, and the resulting PLS model converges to a regular one-component PLS solution. This is the case for data from designed experiments, where columns are orthogonal with respect to each other and no orthogonal latent variables are present. Also consider the opposite case, where the data set consists only of non-relevant information; here the OPLS method finds only orthogonal irrelevant variation and no PLS component, and therefore converges to a PCA solution.

[0159] OPLS can be designed to remove specific information. Instead of removing all systematic non-relevant variation in X, OPLS can be designed to remove specific types of variation in the X data. This is done by setting the unwanted property to y or Y. One example is the common problem of temperature differences between samples in NIR spectra, which produce unwanted variation in the spectra. This provides the opportunity to analyze the specific systematic variation in X produced by temperature differences, and also to further analyze the OPLS treated data X_ortho with minor influence of the disturbing variation that has been safely removed.

[0160] The OPLS method uses the provided X and y or Y data to filter and remove variation in X not relevant to Y. If the given Y data include a great deal of noise, there has been a concern that OPLS might not perform as well as it should, although the information removed from X is indeed always orthogonal to Y. Results from initial studies do not show any degradation of results compared to non-treated data. The use of crossvalidation and/or eigenvalue criteria guides OPLS to perform well on most different types of data.

[0161] Perhaps the greatest advantage of the OPLS method is the improved interpretation of the data. Imagine a twelve component PLS model raising the questions: What are the interesting variables for prediction of the response variable Y? Which plots and parameters are interesting?

[0162] Some would analyze the regression coefficient vector, because that is used to predict Y from X. Others would suggest a combination of all loadings, scores, and the coefficient vector, together with a look at the residuals. That sounds rather tough, but together with prior knowledge of the data it usually works. The OPLS method makes interpretation easier. First of all, it separates the relevant information from the non-relevant orthogonal information. Second of all, it gives an opportunity to analyze the non-relevant information in the data and to understand its sources. Third of all, the number of PLS components becomes much smaller, usually one or two components. Interpreting and analyzing the OPLS model clearly becomes much easier, and an interesting fact is that the first OPLS w loading is equivalent to the first PLS w loading of the original PLS model. That is easy to understand, but it raises an interesting point about the quality of the interpretation of PLS models.

[0163] Principal component analysis (PCA) is the workhorse in multivariate data analysis. Only a brief description is given here; more information can be found in H. Martens, T. Naes, Multivariate Calibration, Wiley, New York, 1989. Any data matrix X of size N*K, where N denotes the number of objects (rows) and K the number of variables (columns), can be decomposed into a number of principal components with PCA.

X=TP′+E

[0164] PCA evaluates the underlying dimensionality (latent variables) of the data, and gives an overview of the dominant patterns and major trends in the data.
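
A minimal NIPALS-style PCA sketch in Python/NumPy (names, starting vector, and convergence tolerance are illustrative choices, not prescribed by the text) may read:

    import numpy as np

    def nipals_pca(X, n_components, tol=1e-10, max_iter=500):
        # Decompose column centered X (N x K) as X = T P' + E.
        E = np.array(X, dtype=float)
        T, P = [], []
        for _ in range(n_components):
            t = E[:, np.argmax(E.var(axis=0))].copy()  # start from a high-variance column
            for _ in range(max_iter):
                p = E.T @ t / (t @ t)
                p = p / np.linalg.norm(p)
                t_new = E @ p
                if np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new):
                    t = t_new
                    break
                t = t_new
            E = E - np.outer(t, p)                     # deflate
            T.append(t)
            P.append(p)
        return np.array(T).T, np.array(P).T, E         # scores, loadings, residuals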

[0165] Partial least squares (PLS) is a projection method that models the relationship between the response Y and the predictors X, see H. Martens et al. The blocks are decomposed as follows:

X=TP′+E

Y=UC′+F

[0166] Here T and U are the score matrices, P and C are the loading matrices for X and Y respectively, and E and F are the residual matrices. The x-scores t_(a) are linear combinations of the X-residuals, or of X itself for the first component, where w is the weight vector.

t_(a)=(X−T_(a−1)*P′_(a−1))*w_(a)

[0167] The scores are computed in a way that maximizes the covariance between T and U. W* are the weights that combine the original X variables (not their residuals as with w) to form the scores t: W*=W(P′W)⁻¹.

[0168] U is related to T by the inner relation

U=T+H, where H is a residual matrix.

[0169] The predictive formulation for Y is as follows

Y=TC′+F*, where F* is the residual matrix.
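
For a single response y, the decomposition above may be sketched in Python/NumPy as follows (a standard PLS1-NIPALS outline; the function name and return values are assumptions made for illustration):

    import numpy as np

    def pls1_nipals(X, y, n_components):
        # PLS1 by NIPALS: X = T P' + E, y = T c + f, with scores computed
        # from X-residuals as t_a = (X - T_{a-1} P'_{a-1}) w_a.
        E = np.array(X, dtype=float)
        f = np.array(y, dtype=float)
        W, P, T, c = [], [], [], []
        for _ in range(n_components):
            w = E.T @ f / (f @ f)            # weight from projection on y
            w = w / np.linalg.norm(w)
            t = E @ w                        # x-score from current residuals
            p = E.T @ t / (t @ t)            # x-loading
            ci = (f @ t) / (t @ t)           # y weight
            E = E - np.outer(t, p)           # deflate X
            f = f - ci * t                   # deflate y
            W.append(w); P.append(p); T.append(t); c.append(ci)
        W, P = np.array(W).T, np.array(P).T
        # Regression coefficients b = W* c with W* = W (P' W)^-1, so y_hat = X b.
        b = W @ np.linalg.solve(P.T @ W, np.array(c))
        return b, W, P, np.array(T).T, np.array(c)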

[0170] The following statistics for the regression models have been calculated. Explained variance of X, for the training set:

R²(X)=1−Σ(X̂−X)²/ΣX²

[0171] Explained variance of y, for the training set:

R²(y)=1−Σ(ŷ−y)²/Σy²

[0172] The crossvalidated predicted variance of y, for the training set:

Q²(y)=1−Σ(ŷ_(pred)−y)²/Σy²

[0173] Root mean square error of prediction, for the test set:

RMSEP=√(Σ(ŷ−y)²/N)

[0174] Distance to Model in X space (DModX), the normalized residual standard deviation in X space:

DModX(i)=√(Σ_k E_(ik)²/(K−A))/s₀

s₀=√(ΣΣE_(ik)²/((N−A−A₀)*(K−A)))

[0175] Here K=number of X variables, A=number of PLS components, A₀=1 for column centered data, E=residual matrix, and N=number of objects in X.
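
These statistics are straightforward to compute; below is a minimal sketch in Python/NumPy (centered data assumed, as in the formulas above; for Q², the same r2 function is applied to crossvalidated predictions ŷ_pred):

    import numpy as np

    def r2(y_hat, y):
        # R2 = 1 - sum((y_hat - y)^2) / sum(y^2), y assumed centered.
        return 1.0 - np.sum((y_hat - y) ** 2) / np.sum(y ** 2)

    def rmsep(y_hat, y):
        # Root mean square error of prediction over N test samples.
        return np.sqrt(np.mean((y_hat - y) ** 2))

    def dmodx(E, A, A0=1):
        # Normalized residual standard deviation in X space for each object.
        # E: N x K residual matrix, A: number of PLS components,
        # A0 = 1 for column centered data.
        N, K = E.shape
        s0 = np.sqrt(np.sum(E ** 2) / ((N - A - A0) * (K - A)))
        si = np.sqrt(np.sum(E ** 2, axis=1) / (K - A))
        return si / s0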

[0176] NIR-VIS spectra were collected from the wavelength region 400-2500 nm. A NIR-Systems 6500 spectrometer was installed on top of a conveyer belt, and 151 baskets filled with different wood chip compositions were measured next to the conveyer belt at the ASSI Domän pulp plant in Piteå, Sweden. The dry content was measured using a reference method. The wood chips dry content varied from 39-58%. From the data set N*K, where N=151 samples and K=1050 digitized wavelengths, 51 spectra were randomly removed as a test set, leaving 101 spectra used as a training set for calibration. The number of PLS components was calculated according to crossvalidation.

[0177] FIG. 2a illustrates column centered untreated NIR spectra and FIG. 2b OPLS treated NIR spectra in accordance with the present invention.

[0178] In FIGS. 2a and 2b, a comparison of NIR spectra before and after OPLS treatment is provided. The untreated NIR spectra display a clear baseline variation with little relevance for the moisture content Y, shown in the upper right corner of FIG. 2a. Irrelevant variation, the baseline and slope problems, has been reduced as depicted in FIG. 2b. Differences in moisture content among the samples now produce most of the variation, and the baseline and slope problems noted earlier have been greatly reduced.

[0179] The results in Table 1 show a clear reduction in the number of PLS components needed when OPLS was used. The OSC method could only extract one component, which was somewhat overfitted; using a second OSC component resulted in a more serious overfit of the PLS model. It should be noted that the SNV, DO, and MSC methods did not perform well. They actually increased the prediction error and worsened the results compared to the original PLS model. None of those methods guarantees orthogonality of what is being removed from X, and therefore relevant information is sometimes lost. It should be pointed out that the DO method did not remove variation orthogonal to Y, and therefore also produced higher prediction errors than the original PLS model. Note that if PCA is used on Xortho to find the latent orthogonal components, and those components, rather than the whole Xortho matrix, are removed from X, and scaling is applied, a clear decrease in the total number of components results. This shows that scaling after OPLS can improve modeling.

[0180] Determining the correct number of OPLS components should be done according to a significance criterion. The regular crossvalidation procedure cannot be used here, because the Y weight vector c becomes zero for all OPLS components. We suggest looking at the amount of orthogonal variation removed for each component, and also at the norm of the w_(ortho) vector found in step 8 of the OPLS method. If that norm is small compared to the norm of the loading p, then little orthogonal variation in X was found in that component, and the number of orthogonal PLS components has been found. An alternative approach is not to stop, but instead to extract such components as regular PLS components, and to continue the method until all variance in X has been accounted for. This allows the OPLS method to continue searching for orthogonal components hidden underneath the first ones. In FIG. 3, the normalized norm of the w_(ortho) vector is shown. A clear trend is visible, and four orthogonal components were removed. Whether or not the fourth orthogonal component is relevant is difficult to say, as was the case for the original PLS model, where the fifth PLS component was close to being irrelevant. In FIG. 4, the explained variation for each orthogonal PLS component is plotted, and there is a clear similarity to FIG. 3. As a rule of thumb, the total number of components for OPLS should never exceed the number of PLS components for the original PLS model.
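
A sketch of this criterion in Python/NumPy (the threshold value is an illustrative assumption; the text itself prescribes no fixed cut-off):

    import numpy as np

    def orthogonal_component_significant(w_ortho_raw, p, threshold=0.1):
        # Compare the norm of the un-normalized w_ortho vector with the
        # norm of the loading p; a small ratio means little orthogonal
        # variation remains in this component.
        ratio = np.linalg.norm(w_ortho_raw) / np.linalg.norm(p)
        return ratio, ratio > threshold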

[0181] FIG. 3 is an illustration of an orthogonal vector w_(ortho) and FIG. 4 of an explained variation for each orthogonal PLS component.

[0182] It is a great advantage of OPLS to be able to analyze the non-relevant information in X as orthogonal principal components. It should be clear that all information in those plots, scores and loadings, is systematically non-relevant, i.e. orthogonal, to the wanted quality variable Y.

[0183] The first two orthogonal loadings are plotted in FIGS. 5 and 6. FIG. 5 illustrates a first principal orthogonal loading of disturbing variation in X and FIG. 6 a second principal orthogonal loading of disturbing variation in X.

[0184] It can clearly be seen that the orthogonal components are in fact basically an offset shift and a slope difference. Those irrelevant variations were detected and removed with OPLS. How come the multiplicative scatter correction (MSC) method, designed to remove these types of disturbances from NIR spectra, did not manage to produce better results? One simple reason could be that the MSC target vector (usually the column mean vector) used to correct all other spectra is not a good choice. The OPLS method finds those correction vectors from the data, and also guarantees that the information removed is not relevant for the modeling of Y. The corresponding score plot for the orthogonal latent components is shown in FIG. 7. FIG. 7 illustrates a score plot t1-t2 of orthogonal principal components.
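
For comparison, a minimal sketch of conventional MSC (a standard method, included here only to show the role of the fixed target vector; the function name is illustrative):

    import numpy as np

    def msc(X, target=None):
        # Regress each spectrum on a target (usually the column mean
        # spectrum) and correct: x_i -> (x_i - offset) / slope.
        target = X.mean(axis=0) if target is None else target
        X_corr = np.empty_like(X, dtype=float)
        for i, x in enumerate(X):
            slope, offset = np.polyfit(target, x, 1)
            X_corr[i] = (x - offset) / slope
        return X_corr

If the mean spectrum is a poor representative of the scatter effects, the correction itself is poor, whereas OPLS estimates its correction vectors from the data and keeps them orthogonal to Y.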

[0185] OPLS gives the possibility to analyze the irrelevant variation in data in orthogonal components (scores and loadings). All variation in the score plots is of no relevance for Y, and therefore the source of unwanted variation can perhaps be found and removed, or reduced. In industrial processes it is not always possible to remove unwanted variation, but OPLS offers the advantage of at least knowing what type of disturbing variation exists, and possibly finding methods to reduce it. Is it possible to know in advance where applying OPLS will help? In principle, all PLS models with more than one PLS component should benefit from using OPLS. Consider designed data with orthogonal variables: only one PLS component is needed because no latent orthogonal variation is present. In FIGS. 8 and 9, a good example of when to use OPLS is displayed, and this phenomenon often occurs for NIR spectra. The first two w loadings from the original PLS model are plotted.

[0186] FIG. 8 illustrates a first loading w1 from an original PLS model and FIG. 9 a second loading w2 from an original PLS model.

[0187] The reason why the first two loadings are similar is that the X data matrix contains large baseline variations (non-relevant) orthogonal to Y. This causes problems for the PLS method. PLS is forced to include some X-Y covariance in each PLS component, even though a great deal of the X variation is orthogonal to Y. PLS solves this by peeling off information from the X matrix over a number of components, leading to a more complex PLS model that is harder to interpret.

[0188] Table 2 shows the PLS model parameters for the original PLS model and for the OPLS treated PLS model. A clear sign of when to use the OPLS preprocessing method is given in Table 2: the amount of explained variance R2Ycum (or crossvalidated Q2cum) is relatively small in the first component, while the amount of explained variation in X, R2Xcum, is rather large. This is a clear indication that large variation orthogonal with regard to Y exists in X, and that OPLS could improve modeling and interpretation. In this case, the orthogonal components revealed that baseline variations were the cause of the large unwanted variation in X. For other types of data, the orthogonal variation could appear in later PLS components, but here the baseline problems introduced such large variation in the data that it appeared in the first component. The OPLS model required only one PLS component because all relevant orthogonal latent variation had already been removed. Once again consider designed experiments where the data have orthogonal variables. Such models require only one PLS component, and the reason is easy to understand: the designed data do not contain any orthogonal latent variation, and therefore the PLS model needs only one component.

[0189] The number of PLS components for the OPLS pretreated PLS model is greatly reduced, making interpretation easier compared to the original PLS model. The scores are no longer masked with irrelevant variation, which has been greatly suppressed with OPLS.

[0190] FIG. 10 illustrates a t1-u1 score plot of an original PLS model and FIG. 11 a t1-u1 score plot of an OPLS pretreated PLS model according to the present invention.

[0191] The t-u correlation in the OPLS treated PLS model is much more distinct and clear. The original PLS model in FIG. 10 does not show much correlation in the first PLS component, mainly due to the disturbing baseline variation. In FIG. 11 the baseline variations and slope differences have been removed, and the PLS model needs only one component to model the X-Y relationship.

[0192] In FIGS. 12 and 13, the first loading vector w is plotted for the original PLS model and for the OPLS pretreated PLS model. Notice that the first loading w in the original PLS model is identical to the first loading w in the OPLS pretreated PLS model. This is easily understood when realizing that w is the projection of the matrix X onto the vector u (y, if there is only one column in Y), using the NIPALS method. Removing orthogonal components from X does not disturb the correlation between X and Y, because orthogonal columns in X do not influence the projection w=u′X/(u′u).
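
A minimal numerical check of this statement (random data, names illustrative; the code uses the column form w = X′u/(u′u)):

    import numpy as np

    rng = np.random.default_rng(1)
    N, K = 50, 20
    X = rng.normal(size=(N, K))
    u = rng.normal(size=N)                      # property score vector (y for one response)
    t_ortho = rng.normal(size=N)
    t_ortho -= u * (u @ t_ortho) / (u @ u)      # make t_ortho exactly orthogonal to u
    p_ortho = X.T @ t_ortho / (t_ortho @ t_ortho)
    X_filtered = X - np.outer(t_ortho, p_ortho) # remove one orthogonal component

    w_before = X.T @ u / (u @ u)                # projection of X onto u
    w_after = X_filtered.T @ u / (u @ u)
    print(np.allclose(w_before, w_after))       # True: w is unchanged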

[0193] FIG. 12 illustrates a first loading w1 from an original PLS model and FIG. 13 a first loading w1 from an OPLS pretreated PLS model.

[0194] This brings us to the interesting subject of interpretation. Both the original PLS model and the OPLS pretreated PLS model have the same first loading weight vector w, but different scores are produced. This means that the PLS weight vector w is not very useful when orthogonal variation is present. The loading vector p is more relevant to analyze with respect to the scores. However, the loadings p are influenced by both the relevant and the irrelevant variation in X mixed together. This makes the interpretation of the PLS model difficult, in particular the physico-chemical interpretation of the scores with regard to which variables are important for prediction and which variables produce disturbing variation in X. OPLS splits the two separate variations in the data into two different data matrices that are analyzed individually and independently of each other.

[0195] It is clear that the regression coefficients of the original PLS model and of the OPLS pretreated PLS model must be very different; see FIGS. 14 and 15.

[0196] FIG. 14 illustrates PLS regression coefficients and FIG. 15 OPLS pretreated PLS regression coefficients according to the present invention.

[0197] The difference originates from the amount of orthogonal information in X that is present in the original PLS model. The regression coefficients display the variables, given the current data, that are important for the modeling of Y. Removing as much as possible of the irrelevant variation in X is important in order to gain relevant and maximum knowledge of the system under investigation, and to keep the model complexity to a minimum. Additionally, analyzing the orthogonal variation in terms of orthogonal principal components (scores and loadings) to find and reduce the irrelevant variation in the data can sometimes be crucial. The suggested OPLS method is a good method to employ for that purpose.

[0198] The OPLS method of the present invention has been shown to be generic and versatile. It can be made an integrated part of regular PLS modeling, improving interpretation and model predictions, or it can be used as a preprocessing method for removing disturbing variation from data. In the example given, the disturbing variation was effectively removed and analyzed with the help of principal component analysis (PCA), and the resulting one component PLS model was easier to interpret. The OPLS method can be seen as a filtering method, where variation irrelevant for the problem at hand is effectively removed. This applies not only to calibration purposes, but to all types of filtering where irrelevant variation in the data X is to be reduced or removed. For example, industrial process signals exhibit drift and other disturbing variations; applying OPLS with time as Y would reveal the variation in X related to time drift. Another example is Quantitative Structure Activity Relationship (QSAR) modeling, where interpretation of the models is vital, and OPLS offers the possibility to separate the relevant variation in the data from the non-relevant variation. Internal validation methods such as crossvalidation and eigenvalue criteria ensure that the OPLS method will work on most types of data. Compared to the earlier proposed OSC method, no time consuming iteration is present in the method. Because OPLS is based on the PLS-NIPALS method, it works with moderate amounts of missing data. The clearest advantage of using the OPLS method is the improvement in interpretation of PLS models and their parameters: scores, loadings, and residuals.
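
As a usage illustration of the time-drift example (reusing the hypothetical opls_filter_one_component sketch given earlier, which is assumed to be in scope; the data here are synthetic):

    import numpy as np

    rng = np.random.default_rng(2)
    N, K = 100, 30
    Xc = rng.normal(size=(N, K))                # column centered process data
    time = np.arange(N, dtype=float)
    time -= time.mean()                         # centered time index used as y
    E, X_time_free, w_o, p_o = opls_filter_one_component(Xc, time)
    # E retains the time-correlated (drift) variation for inspection,
    # while X_time_free contains only variation orthogonal to time.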

[0199] It is thus believed that the operation and construction of the present invention will be apparent from the foregoing description. While the method and arrangement shown or described have been preferred, it will be obvious that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined in the attached claims.

Tables

[0200] TABLE 1. Data set: ASSI NIR (number of components according to crossvalidation).

Method       # Orthogonal comp.   # PLS comp.   R2Y    Q2Y    RMSEP
PLS          —                    5             0.80   0.73   2.95
MSC + PLS    —                    2             0.80   0.81   3.13
SNV + PLS    —                    3             0.81   0.81   3.09
DO + PLS     3 (not orthogonal)   2             0.72   0.68   3.06
OSC + PLS    1                    1             0.81   0.80   3.01
OPLS         4                    1             0.80   0.78   2.95
OPLS (PCA)   1                    1             0.80   0.78   2.94

[0201] TABLE 2. Original PLS model versus OPLS pretreated PLS model.

Original PLS model:
PLS Comp   R2Xcum   R2Ycum   Q2cum
1          0.948    0.107    0.093
2          0.987    0.499    0.478
3          0.995    0.629    0.587
4          0.996    0.757    0.695
5          0.997    0.796    0.731

OPLS PLS model:
PLS Comp   R2Xcum   R2Ycum   Q2cum
1          0.976    0.796    0.782

1. A method for concentration or property calibration of input data from samples of substances or matter, said calibration determining a filter model for further samples of the same substance or matter, comprising to optionally transform, center, and scale the input data to provide a descriptor set (X) and a concentration or property set (y, Y), characterized in that it removes information or systematic variation in the input data that is not correlated to the concentration or property set by providing the steps of: producing a descriptor weight set (w), which is normalized, by projecting the descriptor set (X) on the concentration or property set (y, Y); projecting the descriptor set (X) on the descriptor weight set (w), producing a descriptor score set (t); projecting the descriptor set (X) on the descriptor score set (t), producing a descriptor loading set (p); projecting the property set (y) on the descriptor score set (t), producing a property weight set (c); projecting the property set (y) on the property weight set (c), producing a property score set (u); comparing the descriptor loading set (p) and the descriptor weight set (w), and their difference (p-w), thus obtaining the part of the descriptor loading set (p) that is unrelated to the property set (y); using said difference weight set (wortho), normalized, as a starting set for partial least squares analysis; calculating the corresponding orthogonal descriptor score set (tortho) as the projection between the descriptor set (X) and said normalized orthogonal difference weight set (wortho), and calculating a corresponding orthogonal descriptor loading set (portho) as the projection of the descriptor set (X) onto the orthogonal descriptor score set (tortho); removing the outer product of the orthogonal descriptor score set (tortho) and the orthogonal descriptor loading set (portho′) from the descriptor set (X), thus providing residuals data (E), which is provided as the descriptor set (X) in a next latent variable component; repeating the above steps for each orthogonal latent variable component; the residuals data (E) now being filtered from strong systematic variation that can be bilinearly modeled as the outer product of the orthogonal descriptor score set and the orthogonal descriptor loading set (Tortho*Portho′), thus providing an orthogonal descriptor set (Xortho) being orthogonal to the property set (y, Y); optionally providing a principal component analysis (PCA) on the orthogonal descriptor set (Xortho), producing a bilinear decomposition of the orthogonal descriptor set (Xortho) as the outer product of the principal component analysis score set and the principal component analysis loading set plus principal component analysis residuals (Tpcaortho*Ppcaortho′+Epcaortho), adding the principal component analysis residuals data (Epcaortho) back into the filtered residuals data (E); for filtering new data, proceeding with the following steps: projecting a new descriptor set (xnew′) onto the normalized orthogonal difference weight set (wortho), thus producing a new orthogonal descriptor score set (tnewortho); removing the product between the new orthogonal descriptor score set (tnewortho) and the orthogonal descriptor loading set (portho′) from the new descriptor set (xnew′), thus providing new residuals (enew′), which are provided as a new descriptor set (xnew′) in a next orthogonal component; repeating said filtering steps for new data for all estimated orthogonal components; computing a new orthogonal descriptor set (xnewortho′=tnewortho*Portho′) as the outer product of the new orthogonal descriptor score set (tnewortho) and the orthogonal descriptor loading set (portho′); computing a new orthogonal principal component score set (tnewpcaortho) from the projection of the new orthogonal descriptor set onto the principal component analysis loading set (xnewortho′*Ppcaortho′), whereby the new principal component analysis residuals formed (enewpcaortho=xnewortho′−tnewpcaortho*Ppcaortho′) are added back into the new residuals (enew′) if principal component analysis was used on the orthogonal descriptor set (Xortho) and only the outer product of the principal component analysis score set and the principal component analysis loading set (Tpcaortho*Ppcaortho′) was removed from the original descriptor set (X); for multiple concentration or property sets (Y), calculating a principal component analysis model on said property sets (Y=TP′+F) and repeating the above steps for each separate principal component analysis score set (t), using the orthogonal descriptor set (X_(ortho)) as the input descriptor set (X) for each subsequent principal component analysis score set (t), thus making up said filtering model for filtering of further samples of the same type.
2. A method according to claim 1, characterized by: performing an ordinary PLS analysis with the filtered residuals data (E) and the concentration or property set (y, Y); and performing an ordinary PLS analysis with said filtered new residuals set (enew′) as prediction set.

3. A method according to claim 1 or 2, characterized in that, by finding said orthogonal components for each component separately, an amount of disturbing variation in each partial least squares component can be analyzed.
4. A method according to claims 1-3, characterized in that it uses crossvalidation and/or eigenvalue criteria for reducing overfitting.
5. A method according to claims 1-4, characterized in that said principal component analysis (PCA) components are chosen according to a crossvalidation or eigenvalue criterion.
6. A method according to claims 1-5, characterized in that it is designed to remove specific types of variation in the descriptor set (X) when an unwanted or non-relevant concentration or property set (y) or (Y) exists, by using the orthogonal descriptor set (X_(ortho)) as the data set of interest, as it contains no variation correlated to the concentration or property set (y, Y).
7. An arrangement for concentration or property calibration of input data from samples of substances or matter, said calibration determining a filter model for further samples of the same substance or matter, comprising to optionally transform, center, and scale the input data to provide a descriptor set (X) and a concentration or property set (y, Y), characterized in that said filter model removes information or systematic variation in the input data that is not correlated to the concentration or property set by comprising: projecting means for producing a descriptor weight set (w), which is normalized, by projecting the descriptor set (X) on the concentration or property set (y, Y); projecting means for the descriptor set (X) on the descriptor weight set (w), producing a descriptor score set (t); projecting means for the descriptor set (X) on the descriptor score set (t), producing a descriptor loading set (p); projecting means for the property set (y) on the descriptor score set (t), producing a property weight set (c); projecting means for the property set (y) on the property weight set (c), producing a property score set (u); comparing means for the descriptor loading set (p) and the descriptor weight set (w), and their difference (p-w), thus obtaining the part of the descriptor loading set (p) that is unrelated to the property set (y, Y); using said difference weight set (wortho), normalized, as a starting set for partial least squares analysis; calculating means for the corresponding orthogonal descriptor score set (tortho) as the projection between the descriptor set (X) and said normalized orthogonal difference weight set (wortho), and for calculating a corresponding orthogonal descriptor loading set (portho) as the projection of the descriptor set (X) onto the orthogonal descriptor score set (tortho); calculating means for removing the outer product of the orthogonal descriptor score set (tortho) and the orthogonal descriptor loading set (portho′) from the descriptor set (X), thus providing residuals data (E), which is provided as the descriptor set (X) in a next component; repeatedly using the above means for each orthogonal latent variable component; filtering means for the residuals data (E) from strong systematic variation that can be bilinearly modeled as the outer product of the orthogonal descriptor score set and the orthogonal descriptor loading set (Tortho*Portho′), thus providing an orthogonal descriptor set (Xortho) being orthogonal to the property set (y, Y); optionally providing analyzing means for a principal component analysis (PCA) on the orthogonal descriptor set (Xortho), producing a bilinear decomposition of the orthogonal descriptor set (Xortho) as the outer product of the principal component analysis score set and the principal component analysis loading set plus principal component analysis residuals (Tpcaortho*Ppcaortho′+Epcaortho), adding the principal component analysis residuals data (Epcaortho) back into the filtered residuals data (E); filtering means for new data, proceeding with the following means: projecting means for a new descriptor set (xnew′) onto the normalized orthogonal difference weight set (wortho), thus producing a new orthogonal descriptor score set (tnewortho); calculating means for removing the product between the new orthogonal descriptor score set (tnewortho) and the orthogonal descriptor loading set (portho′) from the new descriptor set (xnew′), thus providing new residuals (enew′), which are provided as a new descriptor set (xnew′) in a next orthogonal component; repeatedly using said filtering of new data for all estimated orthogonal components; calculating means for a new orthogonal descriptor set (xnewortho′=tnewortho*Portho′) as the outer product of the new orthogonal descriptor score set (tnewortho) and the orthogonal descriptor loading set (portho′), and for calculating a new orthogonal principal component score set (tnewpcaortho) from the projection of the new orthogonal descriptor set onto the principal component analysis loading set (xnewortho′*Ppcaortho′), whereby the new principal component analysis residuals formed (enewpcaortho=xnewortho′−tnewpcaortho*Ppcaortho′) are added back into the new residuals (enew′) if principal component analysis was used on the orthogonal descriptor set (Xortho) and only the outer product of the principal component analysis score set and the principal component analysis loading set (Tpcaortho*Ppcaortho′) was removed from the original descriptor set (X); and, for multiple concentration or property sets (Y), calculating means for a principal component analysis model on said property sets (Y=TP′+E), repeatedly using the above means for each separate principal component analysis score set (t) and using the orthogonal descriptor set (X_(ortho)) as the input descriptor set (X) for each subsequent principal component analysis score set (t), thus making up said filtering model for filtering of further samples of the same type.
8. An arrangement according to claim 7, characterized by: partial least squares analysis means for the filtered residuals data (E) and the concentration or property set (y, Y), and for said filtered new residuals set (enew′) as prediction set.
9. An arrangement according to claims 7-8, characterized in that, by finding said orthogonal components for each component separately, an amount of disturbing variation in each partial least squares component can be analyzed by said analyzing means.

10. An arrangement according to claims 7-9, characterized in that it uses crossvalidation and/or eigenvalue criteria for reducing overfitting.

11. An arrangement according to claims 7-10, characterized in that said principal component analysis (PCA) components are chosen according to a crossvalidation or eigenvalue criterion by said analyzing means.

12. An arrangement according to claims 7-11, characterized in that it is designed to remove specific types of variation in the descriptor set (X) when an unwanted or non-relevant concentration or property set (y) exists, by using the orthogonal descriptor set (X_(ortho)) as the data set of interest, as it contains no variation correlated to the concentration or property set (y, Y).